Morphological Tools for Six Small Uralic Languages

Author

  • Attila Novák
Abstract

This article presents a set of morphological tools for six small, endangered minority languages of the Uralic language family: Udmurt, Komi, Eastern Mari, Northern Mansi, Tundra Nenets and Nganasan. Following an introduction to the languages, the two sets of tools used in the project (MorphoLogic’s Humor tools and the Xerox Finite State Tool) are described and compared. The article concludes with a comparison of the six computational morphologies.

Similar resources

Computational Morphologies for Small Uralic Languages

This article presents a set of morphological tools for small Uralic languages. Various Hungarian research groups specialized in Finno-Ugric linguistics and a Hungarian language technology company (MorphoLogic) have initiated a project with the goal of producing annotated electronic corpora for small Uralic languages. The languages described include Mordvin, Udmurt (Votyak), Komi (Zyryan), Mansi...

Low-Resource Active Learning of Morphological Segmentation

Many Uralic languages have a rich morphological structure, but lack morphological analysis tools needed for efficient language processing. While creating a high-quality morphological analyzer requires a significant amount of expert labor, data-driven approaches may provide sufficient quality for many applications. We study how to create a statistical model for morphological segmentation with a ...

The developments, uses, and functions of preverbal particles in Hungarian and other Uralic languages

Within the Uralic language family, preverbal particles generally only occur within the Ugric branch of the Finno-Ugric languages, a fact known for some time (cf. Zsirai, 1933). Most Uralic scholars (who assume the existence of proto-Uralic) assume that preverbal particles are not a Uralic feature, that is, the existence of these particles in a handful of Uralic languages is due to innovations i...

Matti Miestamo (Helsinki), Polar Interrogatives in Uralic Languages: A Typological Perspective

The paper surveys the domain of polar interrogation in the Uralic language family in a typological perspective. An overview of the ways in which polar interrogation is marked in the world’s languages is presented and the encoding of the domain in Uralic languages is examined against this background. All the major types of polar interrogative marking are found in the family. Polar interrogatives...

Languages under the influence: Building a database of Uralic languages

For most of the Uralic languages, there is a lack of systematically collected, consistently transcribed and morphologically annotated text corpora. This paper sums up the steps, the preliminary results and the future directions of building a linguistic corpus of some Uralic languages, namely Tundra Nenets, Udmurt, Synya Khanty, and Surgut Khanty. The experiences of building a corpus containing ...

Publication date: 2006